How Ideal Are We? Incorporating Human Limitations into Bayesian Models of Word Segmentation

Authors

  • Lisa Pearl
  • Sharon Goldwater
  • Mark Steyvers
Abstract

1. Introduction

Word segmentation, the identification of words in fluent speech, is one of the first problems infants must solve during language acquisition. A number of weak cues to word boundaries are present in fluent speech, and there is evidence that infants are able to use many of these, including phonotactics ... However, with the exception of the last cue, all these cues are language-dependent, in the sense that the infant must know what some of the words of the language are in order to make use of the cue. For example, in order to know what the common stress pattern is for words of her native language, an infant has to know some words already. Since the point of word segmentation is to identify words in the first place, this seems to present a chicken-and-egg problem.

Statistical learning has generated a lot of interest because it may be a way out of this problem, by providing an initial language-independent way to identify some words. Since infants appear to use statistical cues earlier than other kinds of cues (Thiessen & Saffran, 2003), statistical learning strategies could indeed be providing an initial bootstrapping for word segmentation. Statistical learning is often associated with transitional probability (Saffran et al., 1996), which has been shown to perform poorly on realistic child-directed speech (calculated over syllables: Gambell & Yang (2006); calculated over phonemes: Brent (1999)).

However, a promising alternative approach is Bayesian learning. Researchers have recently shown that Bayesian model predictions are consistent with human behavior in various cognitive domains, including language acquisition (e.g., ...). ... (henceforth GGJ) found that Bayesian learners performed very well on the word segmentation problem when given realistic child-directed speech samples, especially when compared to transitional probability learners. One critique of GGJ's model is that it is an "ideal learner" or "rational" ...

Similar resources

Online Learning Mechanisms for Bayesian Models of Word Segmentation

In recent years, Bayesian models have become increasingly popular as a way of understanding human cognition. Ideal learner Bayesian models assume that cognition can be usefully understood as optimal behavior under uncertainty, a hypothesis that has been supported by a number of modeling studies across various domains (e.g., Griffiths and Tenenbaum, Cognitive Psychology, 51, 354–384, 2005; Xu an...

Learning Mechanisms for Bayesian Models of Word Segmentation

In recent years, Bayesian models have become increasingly popular as a way of understanding human cognition. Ideal learner Bayesian models assume that cognition can be usefully understood as optimal behavior under uncertainty, a hypothesis that has been supported by a number of modeling studies across various domains (e.g., Griffiths & Tenenbaum, 2005; Xu & Tenenbaum, 2007). The models in these...

A Bayesian Framework for Word Segmentation: Exploring the Effects of Context

Since the experiments of Saffran, Aslin, and Newport (1996), there has been a great deal of interest in the question of how statistical regularities in the speech stream might be used by infants to begin to identify individual words. In this work, we use computational modeling to explore the effects of different assumptions the learner might make regarding the nature of words – in particular, h...

The Utility of Cognitive Plausibility in Language Acquisition Modeling: Evidence From Word Segmentation

The informativity of a computational model of language acquisition is directly related to how closely it approximates the actual acquisition task, sometimes referred to as the model's cognitive plausibility. We suggest that though every computational model necessarily idealizes the modeled task, an informative language acquisition model can aim to be cognitively plausible in multiple ways. We d...

Incorporating Supervision for Visual Recognition and Segmentation

Incorporating Supervision for Visual Recognition and Segmentation, by Alex Yu Jen Shyr. Doctor of Philosophy in Electrical Engineering and Computer Sciences, University of California, Berkeley. Professor Michael I. Jordan, Chair. Unsupervised algorithms which do not make use of labels are commonly found in computer vision and are widely applicable to all problem settings. In the pr...

Publication date: 2009